This Letter applies the concept of 'jets', as constructed from calorimeter cell 4-vectors, to jets
composed (primarily) of photons (or leptons). Thus jets become a superset of both traditional ob- 
jects such as QCD-jets, photons, and electrons, and more unconventional objects such as photon-jets 
and electron-jets, defined as collinear photons and electrons, respectively. Since standard objects 
such as single photons become a subset of jets in this approach, standard jet substructure tech- 
niques are incorporated into the photon finder toolbox. We demonstrate that, for a single photon 
identification efficiency of 80% or above, the use of jet substructure techniques reduces the number 
of QCD-jets faking photons by factors of 2.5 to 4. Depending on the topology of the photon-jets, 
the substructure variables reduce the number of photon-jets faking single photons by factors of 10 
to $10^3$ at a single photon identification efficiency of 80%.



The final states in a collider experiment are charac- 
terized in terms of a handful of objects. The detectors 
are designed to detect photons, electrons, muons, and 
a small number of hadrons (mostly charged pions) be- 
cause these are the only stable objects in the Standard 
Model, apart from neutrinos that exit undetected. Con- 
verting and associating the various signals in different 
parts of the detector to the familiar physics objects is a 
non-trivial challenge. In this Letter, we discuss the class 
of objects that dominantly deposit their energy in the 
high density materials of the calorimeter component of a 
detector. Photons and electrons are absorbed in the inner 
part of the calorimeter (the electromagnetic calorimeter 
or ECal) while the hadrons are absorbed in the outer part 
(the hadronic calorimeter or HCal). 

It is important to note that the content of the final 
state evolves as it moves out from the interaction point. 
At very short times and distances (typically less than 
$10^{-15}$ m) the final state consists of leptons, photons,
and partons. The color charged partons rapidly radiate 
more (largely collinear) partons forming showers of par- 
tons, and subsequently get organized into showers of color 
neutral hadrons. Most of these hadrons decay before or 
within the detector into lower mass hadrons, photons, 
and leptons. Consequently, photons and leptons can be 
part of a QCD-shower. Typically, energy deposits in the 
ECal are identified as isolated photons/electrons (i.e., not 
associated with a QCD shower) if they satisfy various iso- 
lation and shower-shape criteria. The remainder of the 
energy deposited in the ECal and HCal is clustered to- 
gether using a specific 'jet algorithm', to construct jets. 
A small fraction of these jets are tagged as arising from 
the hadronic decays of $\tau$ leptons, based on another set of
isolation and shape variables, and are removed from the 
list of jets. An event, therefore, is primarily classified in 
terms of the number of isolated photons/leptons and jets
observed, along with their kinematic properties. 

As suggested above, jets are often interpreted as the 
experimental 'footprints' of single energetic partons produced
in hard scattering events. A more sophisticated
analysis reveals that such associations are naive: the jets
identified using a typical jet algorithm will always con- 
tain contributions from the color-connected (but kine- 
matically uncorrelated) soft component of the same hard 
collision (the 'underlying event' or UE) and (at high lu- 
minosity) from truly uncorrelated but essentially simul- 
taneous collisions of other beam particles ('Pile-Up' or 
PU). Moreover, jets often contain the showers arising 
from more than one energetic parton. The jet-parton 
mapping breaks down further when we consider photon- 
jets [IrE] or electron-jets [IHZ] that fail to be identified 
as isolated photons or electrons and are accepted as jets. 
More importantly, if the photons inside the photon-jets 
are highly collimated, they may fake single photons. If 
the rate at which these photon-jets pass the detector def- 
inition of photons is large, the measurements performed 
interpreting the detected calorimeter objects as single 
photons become unreliable. 

The issues raised above are extremely important in the 
context of Higgs physics. There are new physics scenar- 
ios where the Higgs particle decays into photon-jets at 
a rate comparable to, or even larger than, its decay to 
single photons [TJ [3]. The precise measurement of the 
$h \to \gamma\gamma$ rate requires a clean separation of photons from
photon-jets (as well as from QCD-jets). At the same 
time, we need a procedure that clearly distinguishes the 
photon-jets (and also electron-jets) from QCD-jets, since 
these photon-jet decay modes for the Higgs can provide 
signatures of physics Beyond the Standard Model [5HTD] . 
In other words, it is essential to extend the list of de- 
tectable/identifiable objects to include photon-jets and 
electron-jets with reliable separation from single pho- 
tons/electrons and from QCD-jets. 

In this Letter we propose such a formalism. The key 
ingredient is that we take 'jets', defined as the output 
of a standard (IR safe) jet algorithm, to be the common 
construct for all physics objects that deposit energy in 
the calorimeters. A subsequent analysis of these jets, 
especially using recently defined jet substructure variables
[TlTU6], allows the jets to be identified and associated
with the appropriate physics objects.

Note that we draw a clear distinction between the ter- 
minology of 'jets' and 'QCD-jets' in this Letter. We 
define 'jets' as the output of jet algorithms such as 
anti-$k_T$ Q2], $k_T$ PUDS], or C/A [2DH22], which, in some
instances, may have nothing to do with the usual QCD 
partons. A jet, therefore, is a generic concept that is de- 
fined in terms of the energy deposited in calorimeter cells 
and identified by a jet algorithm. With this definition 
a QCD-jet is simply a special kind of jet, as is a pho- 
ton/electron or any other conventional/unconventional 
calorimeter based object. 

To distinguish jets of various kinds, we take a multi- 
variate approach. We use a set of observables to train a 
boosted decision tree (BDT) [23] to optimize separation.
The conventional variables that are often used to distin- 
guish a photon/electron from QCD-jets [2H [5S] can be 
applied in our jet-based formalism without compromising 
their efficiency. The additional power of our formalism 
arises from including jet substructure variables. 

Before proceeding, we summarize the advantages of us- 
ing jets as the fundamental objects. First, jets provide a 
unifying language for all calorimeter objects, which elim- 
inates the previous need to use different constructions for 
QCD-jets and photons/electrons. Second, jet substruc- 
ture based observables provide additional power for dis- 
criminating among the various kinds of jets. Finally, per- 
forming a jet substructure based analysis on objects such 
as single photons/electrons and also photon/electron- 
jets, means that grooming techniques (such as filter- 
ing [HI ES [27] , pruning EE1[29], trimming [30]), devel- 
oped mainly in the context of QCD-jets, can now be ap- 
plied to these objects. Such grooming serves to reduce 
contributions from the UE and PU [3TI 132] . 

The efficacy of the above approach will be demon- 
strated through explicit examples from Higgs physics. 
We consider three kinds of events: events with
$pp \to h + X \to \gamma\gamma + X$, events with
$pp \to h + X \to$ 2 photon-jets $+ X$, and finally, QCD dijet events. These
events provide us with samples of single photons (i.e., 
jets dominated by single photons), photon-jets (jets con- 
taining several energetic photons), and QCD-jets. Here 
we concentrate our discussion on the extraction of sin- 
gle photon samples, minimizing the backgrounds due to 
QCD-jets and photon-jets. The analysis based on
conventional variables shows substantial separation be- 
tween photons and QCD-jets, but fails to separate pho- 
tons from photon-jets. The jet substructure variables, 
when used along with the conventional variables, provide 
further separation between single photons and QCD-jets. 
This enhanced analysis can separate single photons from 
photon-jets, photon-jets from QCD-jets, and even offers 
the possibility of determining details of any new physics 
scenario that leads to such photon-jets. In this Letter we 
show only the final results of the multivariable analyses 
and discuss photon-jets of just two particular topologies.
A more exhaustive study of the many different discriminating
variables, along with analyses comparing photon-jets
of varied topologies, will be presented elsewhere [33].

In the rest of the paper we present brief descriptions of 
the discriminating variables and simulation details, fol- 
lowed by a summary of our results. These results demon- 
strate how well single photons, photon-jets and QCD-jets 
can be differentiated from each other and also quantify 
the role played by the jet substructure variables. 

We use two conventional variables that play essential
roles in separating photons/electrons from QCD-jets.
Hadronic Energy Fraction ($\theta_J$): The most powerful
observable for discriminating a photon from a QCD-jet
stems from the fact that a QCD-jet almost always deposits
some energy in the HCal. A QCD-jet consists mostly of pions
and, on average, 2/3 of these pions are charged. The charged
pions lose most of their energy in the HCal. The HCal
isolation criterion exploits the feature that, for a photon
to be isolated, the energy deposited in the HCal (within a
cone of a given size about the direction of the photon) must
be significantly smaller than the energy of the photon
candidate itself. It is straightforward to implement the
isolation criterion in terms of the 'Hadronic Energy
Fraction', defined as the fraction of the total jet energy
deposited in the HCal, $\theta_J = E_J^{\mathrm{HCal}}/E_J$.
In the analysis described below all included jets are
required to pass a cut $\theta_J < 0.25$, which eliminates
[33] about 98% of the QCD-jets but keeps about 94% of the
single photons and photon-jets.
Number of Hard Tracks ($\nu_J$): Charged particles leave
tracks in the Tracker portion of the detector, where they
bend due to the presence of a magnetic field, allowing a
measurement of their momenta and charges. We count the
number of charged particles with $p_T > 2$ GeV present in
the jet, which we label $\nu_J$. This variable can
discriminate photons and photon-jets (characterized by
$\nu_J = 0$ if photons do not convert) from QCD-jets, which
often contain a large number of charged pions. Operationally,
we define a charged particle to be "present in the jet" if a
light-like and arbitrarily soft four-vector, having the same
$\eta$, $\phi$ as the charged particle, is clustered into the
jet when we apply the jet algorithm to the original
calorimeter cell four-vectors plus these new soft four-vectors.
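
As a concrete illustration of the Hadronic Energy Fraction and its
preselection cut, the following minimal sketch computes the fraction of a
jet's energy deposited in the HCal from a list of cell energies. The cell
representation (an energy paired with an "ECal" or "HCal" label), the
function names, and the toy numbers are assumptions made for illustration;
they are not taken from the simulation used in this Letter.

```python
def hadronic_energy_fraction(cells):
    """Return E_HCal / E_total for a jet.

    cells: list of (energy_in_GeV, layer) tuples, with layer "ECal" or "HCal"
    (a hypothetical layout chosen for this sketch).
    """
    total = sum(energy for energy, _ in cells)
    if total <= 0.0:
        raise ValueError("jet has no deposited energy")
    hcal = sum(energy for energy, layer in cells if layer == "HCal")
    return hcal / total


def passes_hcal_cut(cells, max_fraction=0.25):
    """Apply the preselection cut on the hadronic energy fraction."""
    return hadronic_energy_fraction(cells) < max_fraction


# Toy jets (invented numbers): one photon-like, one QCD-like.
photon_like = [(48.0, "ECal"), (1.5, "ECal"), (0.5, "HCal")]
qcd_like = [(20.0, "ECal"), (30.0, "HCal")]
```

With these toy inputs the photon-like jet passes the cut and the QCD-like
jet does not.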

We do not include a 'calorimetric isolation variable', 
defined as the fraction of energy deposited in the outer 
annulus of an inner cone for a given jet. Independent of 
the radius of the inner cone, using a calorimetric isolation 
variable along with $\theta_J$ and $\nu_J$ further reduces the
QCD-jet fake rate by at most of order 10-20%, and fails to reduce
photon-jets faking photons. Often observables based on 
shower-evolution or particle-flow inside the detector are 
used to discriminate photons/electrons from QCD-jets. 
While we do not include these variables in the current 
work, we do not foresee any difficulty in using such vari- 
ables in the context of the jet analysis described here. 

The rest of the variables we use are constructed using
exclusive subjets of jets. The (calorimeter cell) constituents
of a given jet, identified by the jet finding algorithm, are
(re)clustered using the C/A or $k_T$ algorithm until there
remain exactly $N$ four-vectors, i.e., the reclustering is
halted by the constraint on $N$, not the algorithm parameter.
These are the $N$ exclusive (C/A or $k_T$)-subjets of the
given jet.

N-subjettiness ($\tau_N$): $N$-subjettiness, as introduced in
Refs. [34-36], provides a simple way to effectively count
the number of energetic subjets within a given jet, and
hence to discriminate among jets with varied energy
flows. For a given jet and its $N$ exclusive $k_T$-subjets
we evaluate the $N$-subjettiness using the expression [34]

$$\tau_N = \frac{\sum_k p_{T,k} \, \min\{\Delta R_{1,k}, \Delta R_{2,k}, \ldots, \Delta R_{N,k}\}}{\sum_k p_{T,k} \, R}, \qquad (1)$$

where $k$ runs over all the constituents of a jet,
$\Delta R_{l,k} = \sqrt{(\Delta\eta_{l,k})^2 + (\Delta\phi_{l,k})^2}$
is the angular distance between the $l$-th subjet and the
$k$-th constituent of the jet, and $R$ is the characteristic
jet radius used in the jet clustering algorithm. For a jet
with $N_0$ actual energetic subjets, the value of $\tau_N$
will be substantially larger for $N < N_0$ than for
$N \geq N_0$, allowing a 'measurement' of $N_0$.
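
The N-subjettiness expression can be sketched in a few lines. The example
below is only an illustration: the subjet axes are supplied by hand rather
than obtained from an exclusive reclustering, and the data layout, function
names, and toy numbers are assumptions of this sketch.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the azimuthal difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def n_subjettiness(constituents, axes, radius):
    """pT-weighted distance of each constituent to its nearest subjet axis.

    constituents: list of (pt, eta, phi); axes: list of (eta, phi), one per subjet.
    """
    numerator = sum(
        pt * min(delta_r(eta, phi, a_eta, a_phi) for a_eta, a_phi in axes)
        for pt, eta, phi in constituents
    )
    denominator = sum(pt for pt, _, _ in constituents) * radius
    return numerator / denominator


# Toy two-prong jet (invented numbers) with R = 0.4.
two_prong = [(30.0, 0.1, 0.0), (30.0, -0.1, 0.0)]
tau_1 = n_subjettiness(two_prong, [(0.0, 0.0)], 0.4)
tau_2 = n_subjettiness(two_prong, [(0.1, 0.0), (-0.1, 0.0)], 0.4)
```

For this toy jet the value drops to zero once two axes are allowed, which is
the behavior used to count energetic subjets.
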
Subjet distributions in a 'filtered' jet: We consider
5 exclusive subjets for a given jet and, out of these, use
only the 3 largest-$p_T$ subjets to construct the observables
defined in Eq. (2). Note that, by discarding the 2 softest
subjets, we have performed a version of 'grooming'
typically labeled filtering. This ensures that our results
are relatively insensitive to the effects of the UE and PU.
We use the following four variables to quantify how the
leading subjets are distributed inside the jet.

$$\lambda_J = \log\left(1 - \frac{p_{T_L}}{p_{T_J}}\right), \qquad \epsilon_J = \frac{1}{E_J^2} \sum_{i>j} E_i E_j,$$
$$\rho_J = \frac{1}{R} \sum_{i>j} \Delta R_{ij}, \qquad \delta_J = \frac{1}{A_J} \sum_i A_i. \qquad (2)$$

In these equations we use the following definitions: $p_{T_J}$,
$E_J$, $A_J$ are the transverse momentum, energy, and active
area [37] of the given jet; $E_i$ and $A_i$ are the energy and
active area of the $i$-th subjet; $p_{T_L}$ is the $p_T$ of the
leading subjet; and $\Delta R_{ij}$ is the angular distance
between the $i$-th and $j$-th subjets. The variable $\lambda_J$
characterizes the fraction of jet $p_T$ carried by the leading
subjet. The variable $\epsilon_J$ encodes information about how
the jet's energy is shared among the subjets. The geometric
observable $\rho_J$ carries information on the spatial
distribution of subjets inside the jet, while $\delta_J$
characterizes the 'cleanliness' of the jet. In the spirit of
Ref. [38], we use both $k_T$ and C/A subjets to calculate the
variables in Eq. (2). Also, these observables depend on how we
select the subjets. We find the choice of "3 out of 5" for
filtering to be optimal for separating photons from photon-jets
with a range of photon-jet topologies.
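
The four filtered-subjet observables can be sketched as follows, reading the
definitions as: the log of one minus the leading-subjet $p_T$ fraction, the
sum of pairwise subjet energy products over $E_J^2$, the sum of pairwise
subjet distances over $R$, and the summed subjet area over the jet area.
The normalizations by $E_J^2$ and $R$, the dictionary layout, and the toy
numbers are assumptions of this sketch, not a statement of the code used
for the analysis.

```python
import math
from itertools import combinations


def delta_r(a, b):
    """Angular distance between two subjets given as dicts with eta and phi."""
    dphi = (a["phi"] - b["phi"] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a["eta"] - b["eta"], dphi)


def filtered_subjet_observables(jet, subjets, radius, n_keep=3):
    """Return (lambda_J, epsilon_J, rho_J, delta_J) from the n_keep hardest subjets."""
    hard = sorted(subjets, key=lambda s: s["pt"], reverse=True)[:n_keep]
    pairs = list(combinations(hard, 2))
    # Undefined (log of zero) if the leading subjet carries all of the jet pT.
    lam = math.log(1.0 - hard[0]["pt"] / jet["pt"])
    eps = sum(a["energy"] * b["energy"] for a, b in pairs) / jet["energy"] ** 2
    rho = sum(delta_r(a, b) for a, b in pairs) / radius
    delta = sum(s["area"] for s in hard) / jet["area"]
    return lam, eps, rho, delta


# Toy jet with 5 exclusive subjets (invented numbers); the 2 softest are dropped.
jet = {"pt": 100.0, "energy": 100.0, "area": 0.5}
subjets = [
    {"pt": 50.0, "energy": 50.0, "area": 0.1, "eta": 0.0, "phi": 0.0},
    {"pt": 30.0, "energy": 30.0, "area": 0.1, "eta": 0.1, "phi": 0.0},
    {"pt": 15.0, "energy": 15.0, "area": 0.1, "eta": 0.0, "phi": 0.1},
    {"pt": 3.0, "energy": 3.0, "area": 0.1, "eta": 0.2, "phi": 0.2},
    {"pt": 2.0, "energy": 2.0, "area": 0.1, "eta": -0.2, "phi": -0.2},
]
lam, eps, rho, delta = filtered_subjet_observables(jet, subjets, radius=0.4)
```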



In order to minimize the background fake rate for a
given signal acceptance, we include all the variables
described above in BDTs as implemented in the Toolkit
for Multivariate Analysis [35]. Given a signal and
background we construct three separate BDTs, each optimized
using one of the following three sets of variables:

$$D = \left\{ \log\theta_J,\ \nu_J,\ \log\tau_1,\ \frac{\tau_2}{\tau_1},\ \frac{\tau_3}{\tau_2},\ \frac{\tau_4}{\tau_3},\ (\lambda_J, \epsilon_J, \rho_J, \delta_J)\big|_{\mathrm{C/A}},\ (\lambda_J, \epsilon_J, \rho_J)\big|_{k_T} \right\}, \qquad (3)$$
$$D_C = \{\log\theta_J,\ \nu_J\}, \quad \text{and} \quad D_S = D - D_C,$$

where the subscript C/A or $k_T$ means that the observables
are calculated using C/A or $k_T$ subjets. The sets
$D_C$ and $D_S$ consist of conventional and jet substructure
variables, respectively. $D$ is the set of all variables.

We generate all events with Pythia 8 [ID]. For photon-jets
we set up a model in MadGraph 5 [H], where the
Higgs particle decays to a pair of new light scalars ($n_1$)
of mass $m_1$. We simulate photon-jets with two photons
by allowing the decay $n_1 \to \gamma\gamma$. For photon-jets
with four photons we force the $n_1$ to decay via
$n_1 \to n_2 (\to \gamma\gamma)\, n_2 (\to \gamma\gamma)$, where
$n_2$ is a second scalar with mass $m_2$. In this work we set
the Higgs mass at 120 GeV; $m_1 = 1$ GeV to simulate 2 photon
photon-jets; and $m_1 = 5$ GeV, $m_2 = 1$ GeV for photon-jets
with 4 photons. These choices of parameters ensure that the
decay products of the $n_1$ are highly collimated and are
usually contained in a single jet. We use the default
scheme for the UE as implemented in Pythia 8 to simulate
an appropriately busy hadronic environment.
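
A rough estimate makes the collimation statement plausible. The sketch below
uses the rule of thumb that the decay products of a boosted particle of mass
$m$ and transverse momentum $p_T$ are separated by roughly $2m/p_T$, and it
assumes the Higgs is produced at rest so that each scalar carries
$p_T \approx m_h/2$. Both the rule of thumb and the at-rest assumption are
simplifications introduced here; the comparison uses the jet radius
$R = 0.4$ of the analysis.

```python
M_HIGGS = 120.0   # GeV, Higgs mass used in the text
JET_RADIUS = 0.4  # jet radius used in the analysis


def typical_separation(mass, pt):
    """Rule-of-thumb angular separation, roughly 2 m / pT, of a two-body decay."""
    return 2.0 * mass / pt


# Assumption: Higgs at rest decaying transversely, so pT(n1) is about m_h / 2.
pt_n1 = M_HIGGS / 2.0

two_photon = typical_separation(1.0, pt_n1)   # n1 -> gamma gamma, m1 = 1 GeV
four_photon = typical_separation(5.0, pt_n1)  # n1 -> n2 n2, m1 = 5 GeV
```

Both estimates (about 0.03 and 0.17) are well below the jet radius, in line
with the statement that the decay products usually end up in a single jet.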

To simulate a (reasonably) realistic calorimeter, the
photons, electrons, and hadrons in a Pythia event are
grouped into ECal cells of size $0.025 \times 0.025$ and HCal
cells of size $0.1 \times 0.1$ in the ($\eta$, $\phi$) plane. We
incorporate aspects of transverse showering for photons inside
the ECal as well as calorimeter energy smearing for both
the ECal and the HCal. We also simulate the conversion
of photons into $e^+e^-$ pairs. Note, however, that we do
not include effects of a magnetic field inside the detector.
Using the total energy deposited in a cell and its
($\eta$, $\phi$) coordinates we construct light-like momentum
four-vectors for each cell. These four-vectors, corresponding
to the ECal and HCal cells, contribute to the analysis only if
they pass the energy threshold of 0.1 GeV (ECal) and
0.5 GeV (HCal). We use the anti-$k_T$ algorithm as
implemented in FastJet [12] to cluster the calorimeter cells
into jets with $R = 0.4$. Only the leading $p_T$ jet, with
$p_T > 50$ GeV, from each event is used in the analysis.
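
The construction of light-like cell four-vectors and the threshold selection
can be sketched as below. For a massless vector, the transverse momentum
follows from the energy and pseudorapidity as E / cosh(eta). The tuple
layout, the function names, and the choice to keep cells whose energy equals
the threshold are assumptions of this sketch.

```python
import math

ECAL_THRESHOLD = 0.1  # GeV
HCAL_THRESHOLD = 0.5  # GeV


def cell_four_vector(energy, eta, phi):
    """Light-like (E, px, py, pz) built from a cell's energy and (eta, phi)."""
    pt = energy / math.cosh(eta)
    return (energy, pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta))


def select_cells(cells):
    """Keep cells above their layer threshold and convert them to four-vectors.

    cells: list of (energy, eta, phi, layer), with layer "ECal" or "HCal".
    """
    thresholds = {"ECal": ECAL_THRESHOLD, "HCal": HCAL_THRESHOLD}
    return [
        cell_four_vector(energy, eta, phi)
        for energy, eta, phi, layer in cells
        if energy >= thresholds[layer]  # assumption: cells at threshold are kept
    ]


# Toy cells (invented numbers): the first two fall below their thresholds.
toy_cells = [
    (0.05, 0.0, 0.0, "ECal"),
    (0.3, 0.0, 0.0, "HCal"),
    (2.0, 0.1, 0.2, "ECal"),
    (1.0, -0.3, 0.4, "HCal"),
]
selected = select_cells(toy_cells)
```

The resulting four-vectors would then be passed to a jet algorithm; that
step is not reproduced here.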

In this Letter we report our results for three separate
questions. (i) With single photons treated as the signal,
we determine how well we can reduce the rate at which
QCD-jets fake single photons. (ii) We perform the same
analysis treating photon-jets as the background to single
photons. (iii) Finally, we seek to separate single photons
from photon-jets, while, at the same time, attempting to
keep QCD-jets from faking either of these.

FIG. 1. The plots in the left column show the background
fake rate $\mathcal{F}$ versus the single photon acceptance
$\mathcal{A}$, where the solid (dotted) lines correspond to
BDTs using the full $D$ variable set ($D_C$ set only). The
right panels indicate the extra suppression of the fake rate
arising from including the jet substructure variables. In
these figures, the red, maroon, and blue curves designate the
cases when the background is due to QCD-jets, 2 photon
photon-jets, and 4 photon photon-jets, respectively.

In Fig. 1 (left panels), we display the results for the
fake rate ($\mathcal{F}$) versus the acceptance ($\mathcal{A}$)
for single photons treating either QCD-jets (top row) or
photon-jets (bottom row) as the background. In the right
panels we characterize the improvement in separation allowed
by including the jet substructure variables. For a given
signal acceptance, we define the improvement to be the ratio
of fake rates $\mathcal{F}_C/\mathcal{F}_{C+S}$, where
$\mathcal{F}_C$ and $\mathcal{F}_{C+S}$ are the fake
rates if the BDTs are optimized using the variables in
$D_C$ and $D$, respectively.
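
The improvement factor, i.e. the fake rate with conventional variables only
divided by the fake rate with all variables at the same signal acceptance,
can be computed from BDT responses as sketched below. The way the cut is
chosen (the loosest cut that keeps the requested fraction of signal), the
function name, and the toy responses are assumptions for illustration and
are not results from this Letter.

```python
import math


def fake_rate_at_acceptance(signal_scores, background_scores, acceptance):
    """Fraction of background passing the cut that keeps `acceptance` of the signal."""
    ranked = sorted(signal_scores, reverse=True)
    n_keep = max(1, math.ceil(acceptance * len(ranked)))
    cut = ranked[n_keep - 1]
    passed = sum(1 for score in background_scores if score >= cut)
    return passed / len(background_scores)


# Toy BDT responses (invented numbers).
signal_conventional = [0.9, 0.8, 0.7, 0.6]
background_conventional = [0.85, 0.82, 0.5, 0.1]
signal_full = [0.9, 0.8, 0.7, 0.6]
background_full = [0.85, 0.4, 0.3, 0.1]

rate_c = fake_rate_at_acceptance(signal_conventional, background_conventional, 0.5)
rate_cs = fake_rate_at_acceptance(signal_full, background_full, 0.5)
improvement = rate_c / rate_cs
```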

The top panel in Fig. 1 shows that the conventional
variables already provide significant separation between
single photons and QCD-jets. The substructure variables
reduce the fake rate by an additional factor of 2.5 for
a single photon acceptance of 80%, resulting in a total
fake rate of about 1 in $10^4$. For larger acceptance
values the fake rate increases, but the improvement due to
the substructure variables also increases to a value above 4.
The separation of single photons from the photon-jets,
on the other hand, is entirely due to the jet substructure
variables, as indicated in the bottom panel of Fig. 1.
Comparing the 2 photon photon-jets (maroon) case with
the 4 photon photon-jets (blue) indicates that single
photons can be separated more efficiently from the 4 photon
photon-jet background than from photon-jets with 2 photons.
Having multiple photons inside the jet ensures that
the energy in the jet is distributed in multiple subjets,
imparting more substructure to the jet. We find that for
single photon acceptances over 80%, we can obtain fake
rates as low as $2 \times 10^{-4}$ for QCD-jets, 0.05 for
2 photon photon-jets, and $3 \times 10^{-4}$ for 4 photon
photon-jets.

FIG. 2. The BDT responses for QCD-jets (red), photons
(green) and photon-jets (blue). The left (right) panel shows
photon-jets containing 2 photons (4 photons).

Figure 2 displays an example of three-way separation
between single photons, photon-jets, and QCD-jets using
two BDTs. The first BDT is optimized to separate
photon-jets from QCD-jets employing only the conventional
($D_C$) variables (and its response is plotted on the
vertical axis). The second BDT is trained to separate
photon-jets from single photons using only the jet
substructure ($D_S$) variables (and its response is plotted on
the horizontal axis). By construction the upper left corner
is primarily single photons, the upper right is primarily
photon-jets, and QCD-jets tend to lie along the bottom
axis. The left (right) panel corresponds to photon-jets
with 2 (4) photons. In the two-dimensional space of
the responses of these two BDTs, the numerical value
associated with a given contour corresponds to the relative
probability to find a calorimeter object in a cell of
size $0.1 \times 0.1$ in BDT response units, which range from
-1 (background-like) to +1 (signal-like). As indicated
in Fig. 2, separating photons from 2 photon photon-jets
remains challenging. A small fraction of the 2-photon
photon-jet sample (of order a few %), represented by the
dashed blue contours in the upper-left corner, constitutes
an irreducible background to photons. A much cleaner
separation (for photons vs. photon-jets) is observed for 4
photon photon-jets.

In this work we have demonstrated the efficacy of us- 
ing jet based techniques, including jet substructure vari- 
ables, to analyze and identify the full class of objects con- 
structed from the energy deposited in calorimeter cells. 
This class includes not only the familiar single photons 
and QCD-jets, but also the potentially very interesting 
(at the LHC) photon-jets (and lepton-jets). This ap- 
proach not only has the advantage of defining a universal 
language for all such objects, but also enhances the possi- 
ble analyses by allowing the inclusion of recent advances 
in jet substructure technology. Previous efforts to distin- 
guish these objects [43] have largely used variables that 
are constructed in the spirit of substructure techniques, 
but treating everything as a jet allows a much more direct 
employment of jet substructure variables and analyses. 
As we have shown, these can be powerful tools for identifying
both single photons and photon-jets, separating
them from QCD-jets and from each other.